
Analytical Chemistry

American Chemical Society (ACS)

Preprints posted in the last 7 days, ranked by how well they match Analytical Chemistry's content profile, based on 205 papers previously published here. The average preprint has a 0.12% match score for this journal, so anything above that is already an above-average fit.

1
Multi-region sampling of the human small intestine using an ingestible device

Fu, B.; DeSchepper, L. B.; Sun, J.; McKeithen-Mead, S. A.; Kapili, B.; Ochoa-Andersen, P.; Spencer, S. P.; Fardeen, T.; Ricardo, M.; El Kamari, V.; Sinha, S.; Relman, D. A.; Grembi, J. A.; Shalon, D.; Estrela, S.; Huang, K. C.

2026-06-10 gastroenterology 10.64898/2026.06.09.26353912 medRxiv
Top 2%
0.8%

The human small intestine (SI) plays a central role in nutrient processing, host-microbe interactions, and immune regulation, yet remains poorly characterized due to the lack of minimally disruptive sampling methods. Here, we present a protocol for deploying, recovering, and analyzing samples collected using an ingestible device that enables multi-region, lumen-targeted SI sampling during normal digestion. The device incorporates a ~30-cm collapsible tube wound into pH- or time-responsive layers that sequentially unfurl in situ, typically capturing three spatially ordered samples with high yield and reliable retrieval. This protocol outlines study design, participant handling, device recovery, contamination control, and standardized workflows for analyses, including cell quantification, culturomics, sequencing, and metabolomics. We further describe benchmarking approaches for evaluating spatial resolution and strategies for assay prioritization when sample volume is limiting. By reducing participant burden and facilitating integration with stool, saliva, and clinical metadata, this approach enables longitudinal and large-cohort studies linking SI microbial ecology and host physiology to human health.

2
Computational and Experimental Antibody Affinity and Diagnostic Accuracy Quantification of SARS-CoV-2 SD2 Major Disulfide Loop Analog

Pollo, B. A. L. V.; Perias, G. A.; Aguimatang, R. H.; Espiritu, A. P.; Ching, D.; Idolor, M. I.; King, R. A.; Climacosa, F. M.; Caoili, S. E.

2026-06-08 infectious diseases 10.64898/2026.06.05.26353587 medRxiv
Top 2%
0.8%

Introduction: Synthetic oligopeptides provide a rapid and cost-efficient approach to developing antibodies and diagnostics for emerging viral variants. Methods: This study computationally and experimentally characterized a synthetic peptide analog of the SARS-CoV-2 spike subdomain 2 major disulfide loop (SD2MDL), designated S621 (CPVAIHADQLTPTWRVYSTC). Binding affinity was computationally estimated using the Heuristic Affinity Prediction Tool for Immune Complexes (HAPTIC), while experimental validation was performed using enzyme-linked immunosorbent assay (ELISA) with rabbit-derived antipeptide antibodies. Clinical diagnostic accuracy testing was done using plasma samples from RT-PCR-confirmed COVID-19 patients and pre-COVID-19 controls. Results: S621 demonstrated nanomolar binding affinity (Kdapp = 1.14 nM) and high avidity (3.67 nM), closely matching HAPTIC predictions (3.54 nM). Diagnostic evaluation yielded a sensitivity of 89.92% and specificity of 27.79%, corresponding to an overall accuracy of 71.79%. Discussion: These findings demonstrate that a single synthetic peptide derived from a conserved spike subdomain can function as a high-affinity surrogate for full-length antigens, supporting its potential application in rapid peptide-based immunodiagnostics.
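The three diagnostic figures above are linked: overall accuracy is the average of sensitivity and specificity, weighted by the share of positive samples. The abstract does not give case and control counts, so the following is only a minimal Python sketch that solves that relation for the implied share of positives. The function name is illustrative and not from the study.

```python
def implied_positive_fraction(sensitivity, specificity, accuracy):
    """Solve accuracy = p * sensitivity + (1 - p) * specificity for p."""
    if sensitivity == specificity:
        raise ValueError("p is not identifiable when sensitivity equals specificity")
    return (accuracy - specificity) / (sensitivity - specificity)


# Values as reported in the abstract, expressed as proportions.
p = implied_positive_fraction(0.8992, 0.2779, 0.7179)
print(f"implied share of positive samples: {p:.3f}")
```

Under this assumption the reported figures are mutually consistent if roughly 71% of the tested plasma samples came from confirmed cases.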

3
Incremental Clinical Value of Single-Molecule Nanopore Sequencing in Thalassemia Testing: A Prospective Double-blind, Multicenter Study

Xiang, J.; Zhu, B.; Xu, H.; Chen, Y.; Sun, X.; Xiang, R.; Zhao, Y.; Liu, W.; Zhang, L.; He, J.; Liu, J.; Chen, Y.; Fan, Z.; Zhang, H.; Tan, J.; Pang, L.; Shi, L.; Kong, Y.; Cai, A.

2026-06-09 hematology 10.64898/2026.06.09.26354559 medRxiv
Top 3%
0.7%

Background: Thalassemia is one of the most common monogenic disorders worldwide, yet current screening strategies that combine hematological testing with molecular assays still carry a risk of missed diagnoses and suboptimal efficiency, particularly for complex structural variants and rare mutations. Methods: In this prospective, double-blind, multicenter cohort study of 3,842 participants (3,362 pregnant women and 480 male partners), we conducted a head-to-head comparison to systematically evaluate the incremental clinical value and detection performance of single-molecule nanopore sequencing in thalassemia (SMITH) against conventional hematological testing and next-generation sequencing (NGS). Findings: The overall concordance rate between NGS and SMITH was 98.6% (3789/3842). The discrepant cases (n=53) were directly attributed to the superior detection capabilities of SMITH, which successfully identified complex structural rearrangements, including 45 α-globin gene triplications and four HK alleles, that were missed by NGS. Furthermore, SMITH accurately detected four rare variants (c.134_135insT/, c.-22(C>T)/, βN/βc.316-290delinsAGGGCAATAATTT and β3.5 kb deletion/βN) and resolved ten trans and three cis configurations within the globin gene allele. Clinically, these technical advantages translated to a 9.3% (5/54) increase in the detection rate of high-risk prenatal couples, effectively preventing one birth affected by moderate-to-severe thalassemia. Additionally, SMITH corrected a diagnostic discrepancy in one case (HK vs. -3.7), sparing the couple from an unnecessary invasive procedure. Interpretation: Our findings demonstrate that SMITH provides a powerful platform for resolving globin gene rearrangements, detecting rare variants, and enabling direct haplotype phasing. By effectively eliminating diagnostic blind spots, SMITH is expected to become an optimal method for thalassemia prevention programs.
Funding: This study was supported by Chinese National Natural Science Foundation Projects 81760037 and 82271894.

4
Cytoplasmic staining of T cell receptor components enables efficient assessment of lineage and clonality in surface CD3-negative T cell neoplasms

Wilk, A. J.; Gitana, G.; Oak, J.

2026-06-04 pathology 10.64898/2026.06.02.26354783 medRxiv
Top 4%
0.4%

Flow cytometry can establish T cell clonality by detecting a restricted expression pattern of the T cell receptor (TCR) β constant region (TRBC), expressed in association with CD3. However, T cell neoplasms frequently lose surface expression of the CD3/TCR complex, posing a challenge to demonstrating T cell lineage and clonality. To address this challenge, here we present a 12-color flow cytometry panel, called cytoTCR, to characterize cytoplasmic expression of CD3/TCR complex components. We apply cytoTCR to 38 patient specimens with immunophenotypically abnormal T cell populations, demonstrating that this approach can efficiently establish T cell lineage and clonality in challenging T cell neoplasms that have lost surface CD3 expression. While we show that natural killer (NK)-lineage neoplasms can express cytoplasmic CD3 at similar levels to T cells, we show that absent expression of cytoplasmic TCR components by mature lymphocytes can help confirm NK cell lineage. We demonstrate that cytoTCR can detect cytoplasmic TRBC-restriction in challenging cases of null-phenotype anaplastic large cell lymphoma, which lack surface expression of pan-T cell antigens. In cases of T-lymphoblastic leukemia, cytoTCR shows that cytoplasmic TRBC expression matches the expected developmental stage of the leukemia. Finally, we use cytoTCR to characterize atypical cCD3-CD7- T cells in a patient with a history of T-lymphoblastic leukemia as well as recent CAR-T therapy, showing that this atypical population is polytypic and represents CAR-T product rather than residual disease. Our study presents a broadly applicable flow cytometric approach to simultaneously assess T cell lineage and clonality in suspected T lineage populations with absent surface CD3 expression.

5
Impact of Early Treatment on Symptom Improvement and Procedural Events among Men with BPH and Bothersome Lower Urinary Tract Symptoms: A Contemporary Analysis of the American Urological Association Quality (AQUA) Registry

Ernandez, J.; Najafi, A.; Roehrborn, C. G.; Lerner, L. B.

2026-06-10 urology 10.64898/2026.06.08.26355194 medRxiv
Top 4%
0.3%

PURPOSE: As the armamentarium of BPH therapies continues to expand, it remains imperative to maximize patient satisfaction and minimize decisional regret. We sought to determine the impact of time from BPH diagnosis to index treatment on symptom improvement and subsequent procedural events. MATERIALS AND METHODS: We queried the American Urological Association Quality Registry for men ≥40 years old with BPH, available IPSS data, and no receipt of prior BPH treatment. Index treatment included medication, surgery, or minimally invasive surgical therapy (MIST). Outcomes included IPSS over 3 years of follow-up, change in percentage of mild lower urinary tract symptoms (LUTS) by 3 months, and time to procedural event. Patients were stratified by time from index diagnosis to treatment: <12 months, 1-3 years, and >3 years. Outcomes were compared across time-to-treatment cohorts using appropriate statistical tests, with p < 0.05 considered significant. RESULTS: 43,919 patients met criteria, with 19,642 pursuing treatment. Patients pursued treatment at lower baseline IPSS than in prior prospective series. Patients undergoing surgery and MIST had significantly higher baseline IPSS, while medical comorbidities were significantly more common among men initiating pharmacotherapy. Early surgery and MIST were associated with significant improvement in IPSS within 6-12 months and an increase in mild LUTS by 3 months. All forms of early treatment were associated with delayed time to procedural events, including catheterization and fulguration. CONCLUSIONS: Early procedural intervention for BPH is associated with early symptom improvement and delayed time to procedural events in real-world, contemporary practice.

6
Development of a Novel Blood-Based Assay for Brain-Derived Tau and Its Validation in Traumatic Brain Injury

Balogun, W. G.; Zeng, X.; Nafash, M. N.; Sehrawat, A.; Shi, R.; Svirsky, S. E.; Okonkwo, D. O.; Puccio, A. M.; Karikari, T. K.

2026-06-10 neurology 10.64898/2026.06.05.26354965 medRxiv
Top 4%
0.2%

Brain-derived tau (BD-tau) is an emerging blood-based biomarker for neurodegeneration, yet few well-validated BD-tau assays are currently available for research and clinical use. To enhance access to this vital biomarker for neurological disorders including traumatic brain injury (TBI), we developed a novel blood-based immunoassay for BD-tau on the ultra-sensitive Quanterix HD-X platform using Single Molecule Array technology. Analytical validation assessed dilution linearity, specificity, precision, detection limits, and spike recovery, each recording robust metrics in agreement with international expert recommendations. The assay achieved between-run stability of 95% when analyzing aliquots from six independent plasma and serum samples across five analytical runs. It also showed strong dilution linearity when diluted four-fold and achieved over 90% recovery when spiked with cerebrospinal fluid. Next, we evaluated the clinical utility of the assay in cohorts of individuals with TBI, where strong performances were recorded whether using the 2-step or 3-step assay formats (ρ = 0.94; p < 0.0001). Furthermore, plasma BD-tau distinguished samples from TBI patients based on time from injury and severity (AUC = 0.93). Plasma BD-tau differentiated between favorable and unfavorable functional outcomes in the acute-severe group. Our findings underscore the significant potential of the BD-tau assay as a biomarker for TBI in the severe phase.

7
A liquid biopsy-centered, pan-cancer, open next generation sequencing panel to support clinical decision-making (LION panel)

Feierabend, S.; Künstner, A.; Forster, M.; Helbing, T.; Gebauer, N.; Gemoll, T.; Axt, F.; Nimmagadda, S. C.; Ranganathan, L.; Schwandt, J.; Heber, M.; Szymczak, S.; Hohensee, I.; Fliedner, S. M. J.; Scherer, F.; Oberländer, M.; Derer-Petersen, S.; Busch, H.; von Bubnoff, N.; Dazert, E.

2026-06-08 oncology 10.64898/2026.06.05.26354976 medRxiv
Top 8%
0.1%

Cancer treatment has shifted toward personalized therapy based on molecular profiling, particularly in advanced disease. Existing circulating tumor DNA panels are often broad, generating many non-actionable variants and incurring costs that limit routine use in molecular tumor boards. We developed and validated a manufacturer-independent, 109-gene liquid biopsy-centered pan-cancer open next generation sequencing panel (LION panel), combined with an in-house bioinformatic pipeline to support clinical decision-making. A total of 87 samples were analyzed, including 17 reference samples, 21 healthy blood donor controls, and 49 patient samples including nine tumor entities. The LION panel achieved 92% sensitivity and 99% specificity in reference samples, with high concordance to digital droplet PCR (r = 0.99). It detected variant allele frequencies as low as 0.05% (tumor-informed) and 0.5% (tumor-uninformed). Clinical concordance reached 82% with blood-based digital droplet PCR and 75% with whole exome tissue sequencing. In representative cases, variant dynamics correlated with disease progression and revealed additional targetable variants. Overall, the LION panel supports clinical decision-making by enabling identification of targetable variants, disease monitoring, and detection of treatment resistance, particularly when tumor tissue is unavailable.

8
Comparison of the Mini Parasep SF, ParaPak SpinCon, and Paradevice fecal filtration and concentration devices for microscopic and AI-assisted detection of intestinal parasites

Morris, H.; Pritt, B. S.

2026-06-04 infectious diseases 10.64898/2026.06.02.26354769 medRxiv
Top 8%
0.1%

Effective filtration and concentration of stool specimens is an essential pre-analytical step for reducing fecal debris and improving organism recovery using microscopy-based ova and parasite (O&P) examination. This study evaluated three commercially available fecal sedimentation-based filtration/concentration systems, ParaPak SpinCon (Meridian Bioscience), Mini Parasep SF (Apacor), and the newly available Paradevice (Reingenuity), for qualitative parasite detection and workflow logistics using conventional and artificial intelligence (AI)-assisted microscopy. Forty clinical stool specimens (20 parasite-positive and 20 parasite-negative) were processed with the three devices, and the resultant 120 wet mount and 120 trichrome-stained smear preparations were examined using conventional microscopy. Trichrome-stained slides were also scanned at 40x magnification using a Hamamatsu NanoZoomer S360 flatbed digital slide scanner, and images were analyzed using the Techcyte Fusion Human Fecal Trichrome AI algorithm. Positive and indeterminate digital findings were confirmed by conventional glass slide microscopy. Slides and digital images were reviewed in a blinded manner. Concordance was assessed among the 360 initial evaluations (microscopy and AI-assisted), and discrepant parasitology results were resolved through re-review and specimen reprocessing as needed. Final qualitative agreement across slide/image evaluations using all three concentration systems was 100%. Minor discrepancies in protozoan and white/red blood cell detection/identification were noted in 5 and 7 cases, respectively, and likely reflected sampling and observer variability. While the three concentration systems produced equivalent qualitative results, the Paradevice and Mini Parasep SF offered the most streamlined workflows. These findings support the Paradevice and Mini Parasep SF as efficient, analytically equivalent systems that are compatible with traditional and AI-assisted O&P workflows.

9
Analytical Centralization of Health Expenditure at the National Administrator of Health System Resources: Architecture, Data Quality, and Operational Performance of the ADRES Health System Analytics Platform, Colombia

Garavito Jimenez, D. A.; Bello Angulo, D. E.; Mejia Lemus, L. T.; Chipatecua, D.; Fula, D. D.; Perez-Rubiano, S.; Martinez, F. L.; Bohorquez Pinzon, J. C.

2026-06-10 public and global health 10.64898/2026.06.08.26355159 medRxiv
Top 8%
0.1%

Background: Between 2024 and 2025, Colombia universalized the Electronic Health Invoice with embedded Individual Health Services Delivery Records (RIPS), abbreviated FEV-RIPS, as the standard for financial and clinical data exchange. ADRES, the entity responsible for administering the resources of Colombia's General Social Security Health System, faced the challenge of processing information from multiple heterogeneous sources generated by more than 55,000 healthcare providers. Health systems in high-income countries converge clinical-financial data in consolidated platforms; Colombia started from a fragmented architecture with incompatible historical sources, no cross-database standardization, and no centralized analytical infrastructure until 2023. Objective: We describe the design, technical challenges of integrating heterogeneous data, and operational performance of the analytical infrastructure built by ADRES to centralize large-scale processing of Colombian health system information, and derive transferable lessons for health system resource administrators in Latin America facing equivalent digitalization mandates. Methods: Technical-descriptive report based on operational metrics from the ADRES Azure/Databricks environment during January-November 2025. We report indicators of data volume, processing speed, computational capacity, concurrent use by functional group, and governance structure. The architecture integrates VPN connectivity with MinSalud, automated processing of multiple formats (XML, relational tables, flat files), and a medallion data lake (Bronze/Silver/Gold). Data quality challenges include structural inconsistencies across sources, coding incompatibilities (municipalities, dates, diagnoses), format heterogeneities in unstructured data, and absent technical documentation.
Results: The platform manages 21 catalogs, 1,183 tables, and over 110,645 million stored records, with cumulative production exceeding 1 trillion processed records. It executes queries on 100 billion records in ten seconds using clusters of up to 32 TB RAM and 4,096 vCPU. During September-October 2025, monthly query peaks reached 78,028 across eleven functional groups. Integration required Python/PySpark parsers for variable-depth XML, equivalence tables for incompatible municipality codes, cleaning routines for extreme dates used as nulls (1900-01-01, 9999-12-31), and transformation logic bridging classic RIPS and FEV-RIPS. The platform supported econometric analyses, judicial mandate responses, and public interactive dashboards. Conversational AI integration (Genie, Copilot) extends analytical access to users without SQL knowledge. Conclusions: In one year, ADRES built an analytical infrastructure that provides, to our knowledge, the first published documentation of the systemic technical challenges of integrating heterogeneous data sources in a middle-income social security health system. Centralizing health system information at national scale is technically feasible under public institutional constraints, but requires solving cross-source standardization problems the implementation literature does not document with quantitative precision. The derived lessons are transferable to health system resource administrators in Latin America facing equivalent challenges.
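The cleaning routine for extreme dates used as nulls can be illustrated with a short sketch. It uses plain Python rather than the platform's PySpark code, which the abstract does not show. It assumes dates arrive as ISO-formatted strings, and the function name `clean_date` is hypothetical.

```python
from datetime import date

# Sentinel dates the abstract reports being used in place of nulls.
SENTINEL_DATES = {date(1900, 1, 1), date(9999, 12, 31)}


def clean_date(raw):
    """Parse an ISO date string; return None for blanks, unparseable values, and sentinels."""
    if raw is None or not raw.strip():
        return None
    try:
        parsed = date.fromisoformat(raw.strip())
    except ValueError:
        # Assumption: unparseable strings are treated as missing rather than raising.
        return None
    return None if parsed in SENTINEL_DATES else parsed


records = ["2025-03-14", "1900-01-01", "9999-12-31", "", "not-a-date"]
cleaned = [clean_date(r) for r in records]
print(cleaned)
```

Mapping sentinels to a true null keeps them out of downstream date arithmetic, where a value such as 9999-12-31 would otherwise distort durations and maxima.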

10
Large Language Models in Healthcare Simulation Education: A Bibliometric Analysis with AI-Assisted Screening

Pears, M.; Wadhwa, K.; Payne, S. R.; Konstantinidis, S. T. H.; Biyani, C. S.

2026-06-04 urology 10.64898/2026.06.02.26354722 medRxiv
Top 9%
0.0%

Large language models (LLMs) such as ChatGPT are rapidly reshaping healthcare education and simulation-based training in non-technical skills (NTS), yet no bibliometric analysis has mapped this landscape. We searched seven open-access databases (OpenAlex, PubMed, Europe PMC, Crossref, Semantic Scholar, CORE, DOAJ) for English-language publications from January 2020 to March 2026. From 100,277 initial records, a sequential keyword funnel yielded 830 candidate papers, which were screened by 83 independent Claude Sonnet 4.6 AI agents applying pre-specified inclusion criteria (PRISMA-trAIce compliant; Cohen's kappa = 0.86 pre-reconciliation, 1.0 post-reconciliation). The final AI-verified corpus comprised 551 papers with a compound annual growth rate of 109%, contributions from 2,398 authors across 279 journals in 58 countries, and an h-index of 41. ChatGPT dominated the model landscape (46% of papers), with open-source models virtually absent. Virtual patient chatbots were the leading simulation modality (106 papers). Among NTS domains, communication (145 papers) and decision-making (135 papers) were most studied, whereas teamwork, leadership, situational awareness, and crisis resource management were markedly underrepresented. Only 6 urology-relevant papers were identified, none examining LLM integration within boot camp training formats. The field is growing at extraordinary pace but remains concentrated in a narrow range of NTS domains and a single proprietary model. Critical gaps persist in team-based skills training, open-source model evaluation, and specialty-specific simulation. AI-assisted bibliometric screening using multiple independent agents is feasible, reliable, and scalable, offering a replicable methodology for mapping fast-evolving research fields.
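The screening agreement above is reported as Cohen's kappa, which corrects raw agreement between two raters for the agreement expected by chance. A minimal Python sketch of the statistic follows. The include/exclude labels are hypothetical and are not the study's screening decisions.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must label the same non-empty set of items")
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions for ten records.
a = ["include", "include", "exclude", "exclude", "include",
     "exclude", "include", "exclude", "exclude", "include"]
b = ["include", "include", "exclude", "exclude", "include",
     "exclude", "exclude", "exclude", "exclude", "include"]
kappa = cohens_kappa(a, b)
print(f"kappa = {kappa:.2f}")
```

In this toy example the raters agree on nine of ten records, but chance agreement is 0.5, so kappa is lower than the raw 90% agreement.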

11
Beyond event-rate enrichment: proteomic risk scores for mechanism-aware prevention trial design

Fieggen, J.; Simond, G.; Segal, B. M.; Noori, A.; Thakurta, A.; Butler, C. C.; Clifton, D. A.; Clifton, L.

2026-06-10 health informatics 10.64898/2026.06.09.26355266 medRxiv
Top 9%
0.0%

Background. Blood-based biomarkers are increasingly proposed for identifying high-risk individuals before clinical disease and for making prevention-oriented trials more efficient. Prognostic enrichment can increase event rates, but trial efficiency also depends on whether the intervention effect is preserved in the enriched population. Methods. Using the UK Biobank Pharma Proteomics Project, we trained disease-specific proteomic risk scores (ProRS) from 2,916 plasma proteins with elastic-net Cox models. We compared ProRS, polygenic risk scores (PRS), and combined PRS-ProRS scores across ten incident diseases. We estimated cumulative incidence and theoretical two-arm time-to-event trial sample sizes across risk strata. To evaluate effect preservation, we examined six intervention-analogue exposure-outcome pairs spanning genetic (PCSK9/coronary artery disease, APOE/Alzheimer's disease, PPARG/type 2 diabetes, IL23R/Crohn's disease), behavioural (physical activity/all-cause mortality), and pharmacological (RAAS inhibitors versus calcium channel blockers/coronary artery disease) examples. Results. ProRS outperformed PRS for 9 of 10 diseases (median C-index 0.75 versus 0.61). ProRS and PRS were weakly correlated (median Pearson |r| = 0.04), and joint PRS-ProRS stratification identified groups with higher observed incidence than either score alone for several endpoints. In the top risk quartile, combined-score enrichment reduced theoretical required sample sizes by 32-74% under a fixed 20% relative hazard reduction. These gains were not always preserved when stratum-specific intervention-analogue effects were used. Effects were broadly preserved for APOE/Alzheimer's disease and physical activity/mortality. The PPARG/type 2 diabetes effect attenuated toward the null under all three score types, showing that event-rate enrichment does not guarantee effect preservation.
For IL23R/Crohn's disease and the antihypertensive comparison, point estimates differed across score types -- preserved under polygenic but attenuated under proteomic enrichment -- but confidence intervals were wide and overlapping. Conclusions. Proteomic risk scores can identify high-event-rate populations for prevention-oriented trials, but event-rate enrichment alone is insufficient for trial design. Biomarker-guided enrichment should evaluate mechanism-specific effect preservation and may be preferable as a stratification or adaptive-design variable rather than as a restrictive eligibility criterion.
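The link between event rate and required sample size can be sketched with the Schoenfeld approximation for a two-arm log-rank comparison. This is an illustrative assumption: the abstract does not say which sample-size formula the authors used. The event probabilities below are hypothetical, and the 0.80 hazard ratio corresponds to the fixed 20% relative hazard reduction.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.80):
    """Schoenfeld approximation: events needed for a 1:1 two-arm log-rank test."""
    z = NormalDist().inv_cdf
    return 4 * (z(1 - alpha / 2) + z(power)) ** 2 / log(hazard_ratio) ** 2


def required_sample_size(hazard_ratio, event_probability, alpha=0.05, power=0.80):
    """Total participants so that the expected number of events reaches the target."""
    return ceil(required_events(hazard_ratio, alpha, power) / event_probability)


# Hypothetical overall event probabilities during follow-up (not from the paper).
n_unselected = required_sample_size(0.80, event_probability=0.05)
n_enriched = required_sample_size(0.80, event_probability=0.10)
reduction = 1 - n_enriched / n_unselected
print(n_unselected, n_enriched, f"{reduction:.0%}")
```

The required number of events depends only on the hazard ratio, so doubling the event probability roughly halves the sample size. That saving holds only if the hazard ratio is the same in the enriched stratum, which is the effect-preservation condition the abstract examines.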

12
Development of a Human Papillomavirus genotype-informed risk-stratification model to improve Cervical Cancer screening in resource-limited settings: a cross-sectional study

Kambou Kountchou, K. D. K. K.; Tommo Tchouaket, M. C.; Moko Fotso, L. G.; Fokou Bomgning, B. N.; Fippo Fitime, L.; Talom Teumadjou, A.; Routoube, M.; Efakika Gabisa, J.; Ngoufack Jagni Semengue, E.; Nka, A. D.; Kae, A. C.; Dobgima Pisoh, W.; Deutou, L.; Takou, D.; Fainguem, N.; Sosso, S. M.; Kamgaing Simo, R.; Yagai, B.; Tabola Fossa, L.; Perno, C.-F.; Colizzi, V.; Enow-Orock, G.; Fokam, J.; Terrinoni, A.; Kuiate, J.-R.

2026-06-10 pathology 10.64898/2026.06.06.26355059 medRxiv
Top 9%
0.0%

Background: In resource-limited settings, a critical bottleneck in cervical cancer prevention is the lack of practical strategies to triage high-risk human papillomavirus (HR-HPV)-positive women. Therefore, this study aimed to develop and internally validate a genotype-specific risk stratification model. Methods: A cross-sectional study enrolled 555 women in Cameroon. Data collection integrated cervical cytology and HPV genotyping using Abbott m2000rt and Sacace multiplex systems. An iterative modeling approach with bootstrap validation was used to develop the model and address model instability. HR-HPV genotypes were transformed into a hierarchical risk variable due to sparsity and integrated with significant predictors. The final model was translated into a scoring system, and the risk gradients and performance were evaluated at two thresholds. Data were analyzed using SPSS 27.0. Results: The mean age was 44.8 years, and the prevalence of HR-HPV was 26.5% (147/555). The final model, incorporating HPV categories, age, and tobacco, demonstrated moderate discriminative ability (AUC = 0.702, 0.642-0.762) with good calibration (Hosmer-Lemeshow χ² = 4.05, p = 0.399). The scoring system assigned women to risk groups based on their total scores, which produced a clear monotonic risk gradient; the observed probability of high-grade lesions/cancer ranged from 15% (score 0) to >65% (score ≥4). At a conservative threshold (≥4 points), 4.7% (26/555) of women were classified as high-risk, concentrating 46% (6/13) of cancers (positive predictive value [PPV] = 58%), while a sensitive threshold (≥3 points) classified 16.8% (93/555) as high-risk, concentrating 77% (10/13) of cancers (PPV = 38%). Both thresholds maintained a high negative predictive value (>95%). Conclusion: This bootstrap-validated risk-stratification tool is a proof of concept in resource-limited settings that assigns HR-HPV-positive women to distinct management pathways using three variables.
After refining through a longitudinal study and external validation, this scoring system can improve the efficiency of cervical cancer screening programs in low-resource settings.
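The proportions quoted for the two score thresholds can be recomputed from the counts given in the abstract. This is a small arithmetic check in Python, not part of the study's SPSS analysis. The helper name `pct` is illustrative.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


total_women, total_cancers = 555, 13

# (women flagged as high-risk, cancers among them), as reported per threshold.
thresholds = {">=4 points": (26, 6), ">=3 points": (93, 10)}

print("HR-HPV prevalence:", pct(147, total_women), "%")
for label, (flagged, cancers_captured) in thresholds.items():
    print(label,
          "| flagged:", pct(flagged, total_women), "%",
          "| cancers captured:", pct(cancers_captured, total_cancers), "%")
```

The recomputed values match the percentages stated in the abstract after rounding. They also show the trade-off between the thresholds: the lower cut-off flags more women and captures a larger share of the 13 cancers.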

13
Alcohol Consumption Patterns and Sociodemographic Correlates Among US Adults with Cardiovascular Disease: A Cross-Sectional Analysis of All of Us and NHANES

Yang, Q.; Yu, J.; Zhao, H.; Zou, M.; Sun, Y.

2026-06-09 public and global health 10.64898/2026.06.06.26355052 medRxiv
Top 10%
0.0%

This cross-sectional study aimed to examine the prevalence of alcohol use and its sociodemographic correlates among adults with cardiovascular disease (CVD). We analyzed data from two large US cohorts: the All of Us Research Program (2017-2023) and the National Health and Nutrition Examination Survey (NHANES, 1999-2016). Both CVD diagnosis and past-year alcohol consumption were self-reported. Risky drinking was defined as exceeding moderate drinking or binge drinking (All of Us), or moderate/heavy drinking (NHANES). Multivariable logistic regression was used to examine associations with sociodemographic and lifestyle factors. Among 32,788 current drinkers with CVD in the All of Us cohort, 15% exceeded moderate drinking thresholds and 26% reported binge drinking. Older age, female sex, and higher socioeconomic status were inversely associated with risky drinking, while smoking was positively associated. In NHANES, moderate drinking rose from 47.3% to 57.2% and heavy drinking from 6.7% to 7.2%. Moderate/heavy drinking was positively associated with age <65 but inversely with age ≥65. Higher education and income were linked to moderate drinking, while current smoking was strongly associated with heavy drinking. These results highlight the need to integrate holistic screening for alcohol use, tobacco use, and social context into routine cardiovascular care.

14
Multiplexed temporal SWCNT biosensor combined with convolutional autoencoding identifies ALS-specific serum protein corona signatures

Sirtori, R.; Lopez, R. M.; Li, H.; Liu, C.; Fisk, N.; Roxbury, D. E.; Fallini, C.

2026-06-08 neurology 10.64898/2026.06.08.26354966 medRxiv
Top 10%
0.0%

Amyotrophic lateral sclerosis (ALS) lacks a validated blood-based diagnostic, and the field is increasingly moving from single-molecule markers toward integrative, multi-component signatures. Here we present a liquid-biopsy strategy that transduces disease-dependent serum-nanoparticle interactions into a learnable near-infrared spectral phenotype. A sensor array of twelve single-walled carbon nanotube (SWCNT) chiralities functionalized with (GT)6 ssDNA, coupled with a deep learning model, was tested on serum from 20 ALS patients and 19 age- and sex-matched controls (n = 39, TargetALS). Our multiplexed sensor design (12 SWCNT chiralities) and data acquisition strategy, based on excitation-emission matrices acquired at three timepoints (0, 6, 24 h), were conceived to maximize the information carried by the sensor. Indeed, we show that the array generates partially independent temporal dynamics across chiralities governed primarily by tube diameter. To decode this multiplexed, time-resolved signal, we trained a dual-objective convolutional autoencoder that jointly optimizes reconstruction and classification, achieving 84.6% cross-validated accuracy (AUC = 0.87). Selected latent features were reproducible across an independent same-subject experimental batch and correlated with serum neurofilament light chain, linking the spectral phenotype to a clinically relevant neurodegeneration marker. Mass spectrometry supported a molecular basis for discrimination, revealing an ALS-biased protein corona enriched in adaptive-immune and inflammatory proteins. Together, these results establish proof of principle that time-resolved, multi-chirality SWCNT spectral sensing can compress complex serum composition into a reproducible near-infrared biomarker signature for ALS.

15
Influencers, not just adverts: social media influencer exposure and tobacco use among urban youth in Kampala and Nairobi - a comparative mixed methods study

Jawahar Kanth, J. S.; Anish, T. M. R.; Odhiambo, B.; Lwembawo, K. D.; Micheal, S.; Arinaitwe, J.; Nakiyingi, L.

2026-06-10 public and global health 10.64898/2026.06.06.26355037 medRxiv
Top 11%
0.0%

Tobacco control treaties were written for billboards and television, not for the people now selling lifestyles to young Africans. As mobile internet saturates East African cities, social media influencers have become an unmeasured channel, especially when it comes to tobacco promotion. We assessed the prevalence of tobacco use, its association with influencer exposure, and how urban youth interpret that exposure in two capitals with different tobacco laws. We conducted a comparative mixed-methods study among youth aged 18-29 years in Kampala, Uganda, and Nairobi, Kenya (January-August 2025), combining (i) a cross-sectional survey using systematic sampling at youth-dense venues (n=772), (ii) four online focus group discussions (FGDs; n=40), and (iii) content analysis of 30 tobacco-related posts from high-reach influencers (greater than 50,000 followers). We used chi-square tests and multivariable logistic regression, thematic analysis (Braun and Clarke), and descriptive engagement metrics. Ever tobacco use among urban youth in East Africa was 29.3% (226/772), similar in Kampala (30.7%) and Nairobi (28.0%; p=0.409). After adjustment, exposure to influencers promoting tobacco independently predicted ever use (adjusted odds ratio [aOR] 1.90, 95% confidence interval [CI] 1.29-2.82; p=0.001), alongside male sex (aOR 2.35) and age 26-29 years (aOR 1.99). Tertiary education (aOR 0.45) and never seeing tobacco content (aOR 0.26) were protective. Posts framed tobacco as aspirational lifestyle; 77% of sampled comments were positive and 47.5% expressed interest in trying the product. Influencer exposure behaved as a modifiable risk factor of a magnitude comparable to established demographic drivers. Tobacco control in the region must move from print-era advertising bans to platform governance, mandatory disclosure of paid promotion, and youth-led counter-marketing.
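
The reported odds ratio, confidence interval, and p-value can be cross-checked against each other. The sketch below assumes the interval is a standard Wald interval on the log-odds scale (estimate ± 1.96 standard errors), which the abstract does not state; the helper names are hypothetical.

```python
import math

def se_from_ci(lower, upper, z_crit=1.96):
    """Standard error of log(OR), recovered from a 95% Wald confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)

def two_sided_p(odds_ratio, se):
    """Two-sided p-value for H0: OR = 1 under a normal approximation."""
    z = abs(math.log(odds_ratio)) / se
    return math.erfc(z / math.sqrt(2))  # equals 2 * (1 - Phi(z))

# Values taken from the abstract: aOR 1.90, 95% CI 1.29-2.82.
se = se_from_ci(1.29, 2.82)
p = two_sided_p(1.90, se)
print(f"SE(log OR) = {se:.3f}, p = {p:.4f}")
```

Under that assumption the recovered p-value is on the order of 0.001, consistent with the reported p=0.001.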

16
Effect of levodopa treatment on gait in older adults with mild parkinsonian signs

Pongmala, C.; Roytman, S.; van Emde Boas, M.; Vangel, R.; Rosano, C.; Bohnen, N.

2026-06-06 geriatric medicine 10.64898/2026.06.04.26354926 medRxiv
Top 11%
0.0%

Background: Slow walking in older adults with mild parkinsonian signs (MPS) is a complex, multifactorial phenomenon arising from the cumulative burden of subclinical age-associated pathologies. This decline reflects age-associated neuronal loss in the dopaminergic system. A recent study suggests that levodopa treatment may enhance gait parameters. The goal of this small pilot study is to explore the effect of levodopa treatment on slow walking gait in older adults with MPS. Method: This study was a randomized, placebo-controlled clinical pilot trial. Slow walking older adults without clinical evidence of PD were recruited and randomized into 2 groups (active treatment group or placebo control group). Participants in the active group were pre-treated with carbidopa for three days, followed by carbidopa-levodopa for seven days. Spatiotemporal gait parameters were evaluated at baseline and post-intervention. Results: Gait factor analysis identified three main factors explaining gait characteristics at baseline: gait efficiency, gait rhythmicity, and gait turning. No effect of treatment was observed in the placebo group (p=0.111, p=0.616), and no group difference was observed between the placebo and active groups at baseline (β=0.310, p=0.547), but a strong trend for a treatment-related increase was observed in the active treatment group (β=0.506, p=0.076). Conclusion: Our preliminary data suggest that sustained levodopa treatment (one week) in conjunction with carbidopa pre-treatment and concomitant carbidopa supplementation is feasible in slow walking older adults with MPS. Moreover, the data indicate potential efficacy, showing improvements in cadence and step durations.

17
An AI-assisted feasibility evaluation of three photoplethysmography-derived microvascular reactivity signals in MIMIC-IV-WDB v0.1.0

Landry, T. C.; Kim, Y.

2026-06-06 health informatics 10.64898/2026.06.03.26354863 medRxiv
Top 11%
0.0%

Background. Capillary refill time, an examiner-dependent bedside test of distal microvascular perfusion, has become a resuscitation target in septic shock [1-4], motivating a continuous surrogate computed from the photoplethysmogram (PPG, the optical waveform the pulse oximeter on every ICU patient already records) [5-8]. Objective. We attempted three PPG-derived candidate measures on the MIMIC-IV Waveform Database (MIMIC-IV-WDB v0.1.0) and asked, by inspecting randomly drawn examples, whether each captured its intended physiology before any downstream modeling. Methods. MIMIC-IV-WDB v0.1.0 [9] was linked to MIMIC-IV [10]. The signals were a cuff-anchored perfusion-index recovery (reactive hyperemia when the cuff shares an arm with the probe), a slow Mayer-wave-band power ratio of the perfusion index (sympathetic vasomotor tone), and a per-beat diastolic exponential decay time constant (a refill-like recovery time). For each signal we drew 10 random examples at a fixed seed and checked them against a checklist fixed in advance. Each was read by the author and, separately, by MedGemma 1.5, a multimodal medical language model run locally. A synthetic test with a known time constant checked the third signal. Results. The cuff-anchored signal showed the expected occlusion-reperfusion shape on 268 of 6,236 evaluable cuff cycles (4.30%) in 15 of 19 patients, consistent with opposite-limb placement of the probe and cuff. The slow-band ratio returned a stable cohort value, but a clear, stationary peak appeared in only 4 of 10 random windows. The per-beat fit met its goodness-of-fit threshold in 10 of 10 beats, yet a cardiac-frequency heuristic flagged a possible fit on the heart-rate oscillation in 7 of 10, and in 5 of 17 patients the time constant lay where an exponential is indistinguishable from a straight line. A 0.5 Hz high-pass pre-filter implanted its own approximately 318 ms time constant regardless of truth. 
The language model tracked the human on clear positives but reported the pattern present on every call it returned, never absent. Conclusions. Two of the three candidate signals did not reflect their intended physiology in most examples, and the third was constrained by sensor placement. Inspecting a few random raw inputs against a checklist written in advance is an inexpensive upstream check before downstream inference on PPG-derived microvascular signals.
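
The approximately 318 ms figure matches the time constant of a first-order high-pass filter with a 0.5 Hz cutoff, tau = 1 / (2 * pi * fc). The abstract does not state the filter order or design, so the first-order assumption and the function names below are illustrative only.

```python
import math

def highpass_time_constant(cutoff_hz):
    """Time constant (seconds) of a first-order high-pass filter: tau = 1 / (2*pi*fc)."""
    return 1.0 / (2.0 * math.pi * cutoff_hz)

def step_response(t_seconds, cutoff_hz):
    """Response of that filter to a unit step: a decaying exponential exp(-t / tau)."""
    return math.exp(-t_seconds / highpass_time_constant(cutoff_hz))

tau_ms = 1000.0 * highpass_time_constant(0.5)
print(f"tau = {tau_ms:.1f} ms")
```

Because such a filter turns any abrupt level change into an exponential decay with this fixed tau, an exponential fit applied after filtering can recover the filter's own time constant instead of a physiological one.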

18
From Charting Burden to Workflow Signal: Retrospective Validation of Documentation-Density Measures for ICU Complexity and Long-Stay Risk

Collier, A.

2026-06-06 health informatics 10.64898/2026.06.04.26354922 medRxiv
Top 11%
0.0%

Background Electronic health record documentation patterns may reflect workflow complexity, monitoring intensity, and operational strain in intensive care settings. However, documentation-derived features can be sensitive to local documentation culture, data capture systems, and outcome definitions. Retrospective validation across multiple datasets is therefore needed before these signals are used in workflow intelligence or clinical AI governance tools. Objective To evaluate whether documentation-density and documentation-timing features show reproducible retrospective signal for ICU workflow complexity and long-stay proxy outcomes across de-identified critical care datasets, while distinguishing workflow and long-stay associations from unsupported claims about mortality prediction, burden reduction, or deployment readiness. Methods We synthesized retrospective validation results from de-identified ICU and workflow datasets generated through a prespecified documentation-density validation program. Feature families included Documentation Burden Score style features, Shift-End Documentation Rate style features, documentation reliability style metadata, and all-documentation feature sets where available. Outcomes included long ICU length of stay proxies, mortality where available, and workflow proxy endpoints. Models compared baseline feature sets with enhanced models containing documentation-density or workflow features. Performance was summarized using area under the receiver operating characteristic curve, Brier score where reported, delta AUROC, bootstrap confidence intervals where reported, and label-shuffle controls where available. Results The strongest external long-stay proxy evidence came from the NWICU chartevents analysis, which included 28,612 ICU stays, 20,267 stays with chart events, and 9,619,759 chart events. For ICU length of stay greater than the median, baseline AUROC was 0.5252. 
Enhanced AUROC was 0.9512 for Documentation Burden Score features, 0.9214 for Shift-End Documentation Rate features, 0.8470 for documentation reliability style features, and 0.9517 for all documentation features. Corresponding label-shuffle enhanced AUROCs were near random, ranging from 0.4897 to 0.5064. For ICU length of stay greater than the 75th percentile, baseline AUROC was 0.5155. Enhanced AUROC was 0.9433 for Documentation Burden Score features, 0.9194 for Shift-End Documentation Rate features, 0.8118 for documentation reliability style features, and 0.9427 for all documentation features, with label-shuffle enhanced AUROCs from 0.4836 to 0.4999. Additional retrospective support was observed in eICU workflow analyses, HiRID first-24-hour documentation-density analyses, MIMIC-IV HF ICU internal analyses, MIMIC-IV-Note metadata extensions, and nursing-chart or lab density proxy analyses. However, cross-institution discrimination transfer was weak without recalibration, and several analyses remained proxy validations rather than final clinical validations. Conclusions Documentation-density and documentation-timing features show promising retrospective signal for ICU workflow complexity and long-stay proxy outcomes, especially in NWICU chartevents and selected internal dataset-specific analyses. These findings support further preregistered, prospective, silent-mode validation of documentation-derived workflow intelligence. They do not establish prospective clinical performance, mortality reduction, clinician burden reduction, autonomous deterioration prediction, or deployment readiness.
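
A label-shuffle control of the kind mentioned above can be sketched as follows: AUROC is computed on the real labels and again after the labels are randomly permuted, where it should fall to about 0.5. This is a simplified sketch on toy data: it permutes labels against fixed scores, whereas a full control would typically refit the model on the shuffled labels. The function names and data are hypothetical.

```python
import random

def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def label_shuffle_auroc(labels, scores, n_rounds=200, seed=0):
    """Mean AUROC over repeated random permutations of the labels."""
    rng = random.Random(seed)
    shuffled = list(labels)
    total = 0.0
    for _ in range(n_rounds):
        rng.shuffle(shuffled)
        total += auroc(shuffled, scores)
    return total / n_rounds

# Toy data: scores that separate the two classes perfectly.
toy_labels = [0] * 50 + [1] * 50
toy_scores = [i / 100 for i in range(100)]
real_auroc = auroc(toy_labels, toy_scores)
shuffled_auroc = label_shuffle_auroc(toy_labels, toy_scores)
```

A shuffled AUROC near 0.5 indicates that the score carries no signal once the label link is broken; it does not by itself show that the features are free of information derived from the outcome.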

19
Metatranscriptomics-Derived Disease Risk Scores as a Preventive, Diagnostic, and Treatment Support Tool

Hu, L.; Bass, M.; Patridge, E.; Molusky, M.; Antoine, G.; Vuyisich, M.; Banavar, G.

2026-06-06 genetic and genomic medicine 10.64898/2026.05.29.26354333 medRxiv
Top 11%
0.0%

Background: Chronic diseases and symptom syndromes often develop after prolonged biological changes that may precede formal diagnosis. RNA-based metatranscriptomics captures active microbial and human gene expression and may provide a functional layer for disease risk evaluation. To address this translational gap, we developed and validated a Disease Risk Score (DRS) framework that integrates metatranscriptome-derived pathway activity scores from stool, saliva, and blood samples, and evaluated its potential clinical utility as an adjunct risk-evaluation tool. Methods: DRS uses disease-specific sets of pathway activity scores derived from stool and saliva microbial functions, stool and saliva microbial taxa, and blood human gene expression. For each disease, 'not optimal' pathway scores are aggregated into a normalized cumulative odds ratio, or cOR, using score-level odds ratios, statistical significance, and literature-supported biological relevance derived from a Development Cohort of 22,369 individuals. A cOR ≥ 5 is defined as high risk. Performance is evaluated in an independent Validation Cohort of 15,908 individuals using self-reported diseases as the reference. Disease support requires both significant cOR separation between self-reported and not-reported (Cohen's d ≥ 0.2) and risk ratio enrichment of self-reported disease among individuals classified as high risk (95% CI of Risk Ratio > 1). Results: Of 20 initially evaluated diseases, 15 meet the prespecified validation criteria on the independent validation cohort: ADHD, anxiety, chronic fatigue syndrome, depression, GERD, hypertension, inflammatory bowel disease, IBS-C, IBS-D, insomnia, MASLD, obesity, obstructive sleep apnea, Sjogren's syndrome, and type 2 diabetes. 
Five selected clinical scenarios illustrate how DRS can support clinician-mediated decision making, including IBS subtype reclassification, improved diagnostic acceptance in IBS-D, personalized lifestyle counseling in MASLD and early type 2 diabetes, and diagnostic uncertainty in atypical GERD. Conclusions: DRS is a metatranscriptomics-based risk-stratification framework that aggregates active microbial and human pathway signals into interpretable disease-specific risk estimates across a wide range of disease conditions. Validation against self-reported disease labels in an independent cohort shows significant risk enrichment for each of 15 diseases. DRS is intended as an adjunct to clinical evaluation: a decision support tool in situations where routine care encounters uncertainty, delay, or low patient engagement. Future prospective studies using clinically adjudicated endpoints are needed to assess calibration and clinical outcomes.
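
The two validation criteria named in the Methods (Cohen's d ≥ 0.2 and a risk-ratio confidence interval above 1) can be sketched on made-up numbers as below. The abstract does not say how the interval was computed; the log-scale (Katz) interval, the pooled-standard-deviation form of Cohen's d, and all counts and names here are assumptions for illustration, not study data.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

def risk_ratio_ci(cases_high, n_high, cases_low, n_low, z=1.96):
    """Risk ratio with a 95% CI computed on the log scale (Katz method)."""
    rr = (cases_high / n_high) / (cases_low / n_low)
    se = math.sqrt(1 / cases_high - 1 / n_high + 1 / cases_low - 1 / n_low)
    return rr, rr * math.exp(-z * se), rr * math.exp(z * se)

# Made-up example: cOR values in two groups, and disease counts by risk class.
d = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
rr, lower, upper = risk_ratio_ci(cases_high=30, n_high=100, cases_low=15, n_low=100)
supported = d >= 0.2 and lower > 1.0
```

In this toy case both thresholds are met, so `supported` is true; with real cohorts the same check would be repeated per disease.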

20
BodyMAE: A Surface-Area Aware Masked Autoencoder for Body Composition Estimation from 3D Body Scans

Zheng, Y.; Feng, B.; Cheng, R.; Qiu, C.; Long, Z.; Vaziri, K.; Hahn, J.

2026-06-06 health informatics 10.64898/2026.06.04.26354925 medRxiv
Top 11%
0.0%

Accurate assessment of body composition is important for risk stratification and management of metabolic, musculoskeletal, and aging-related diseases, yet reference modalities such as dual-energy X-ray absorptiometry (DXA) are costly and impractical for frequent monitoring. Commodity 3D body scans offer a low-cost, radiation-free alternative, but extracting meaningful and predictive shape features from scans remains challenging due to nonuniform point density, variable body size, and cross-device differences. We introduce BodyMAE, a self-supervised, surface-area aware masked autoencoder for metric-scale 3D body scans. The pipeline integrates area-adjusted sampling, a long-range focused encoder, and a lightweight decoder regularized to promote locally uniform reconstructions. Trained and evaluated on 917 3D body scans paired with clinical DXA reports, BodyMAE achieves strong accuracy on fat percentage (root-mean-square error (RMSE) 3.825 percentage points, R^2 0.908), fat mass (RMSE 3.694 kg, R^2 0.968), and lean mass (RMSE 3.608 kg, R^2 0.901), with competitive performance on bone mineral content (RMSE 0.284 kg, R^2 0.754). We also assess feature stability across pretrained baselines, finding higher retrieval accuracy for our representations (Top-1 90.131%). These results indicate that combining metric-aware sampling, long-range relational encoding, and local geometric regularization enables accurate body composition estimation from 3D body scans, as validated by comparisons to DXA-derived measurements.
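
For readers less familiar with the two reported metrics, the sketch below computes RMSE and R^2 from paired reference and predicted values. The numbers are invented for illustration and are not from the study; the function names are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Invented fat-percentage values: reference (DXA-like) vs. predicted.
reference = [20.0, 25.0, 30.0, 35.0]
predicted = [21.0, 24.0, 31.0, 34.0]
err = rmse(reference, predicted)
r2 = r_squared(reference, predicted)
```

RMSE is in the units of the target (percentage points or kg), while R^2 is unitless, which is why the abstract reports both for each body-composition measure.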